hum tee dum tee#43
Open
adiled wants to merge 20 commits into
New hum-paths crate. init() sets XDG defaults at startup; every callsite routes through it. No /tmp fallback. Socket moves to $XDG_STATE_HOME/hum.
Issue #41 root cause was Arc<Mutex<Option<Child>>> + try_lock in claude-cli: the wait task held the lock forever, the kill closure's try_lock always failed, and SIGKILL never reached the child.
- nest::lifecycle::supervise(AsyncGroupChild) using tokio::select! over child.wait() vs CancellationToken. command-group for tree-kill so claude's descendants die too. Reaps with timeout.
- Cell.kill: Arc<dyn Fn> -> Cell.cancel: CancellationToken.
- Drop on CellBundle cancels on map removal; idle reaper + LRU evict + map clear all kill correctly.
- claude-cli + claude-repl + mock + nest::pool + serve.rs migrated.
- lru::LruCache replaces HashMap + linear-scan eviction (O(1)).
- nest::metrics swapped /proc parser for sysinfo (cross-platform RSS/CPU).
- metrics + metrics-exporter-prometheus on humd; /metrics on 127.0.0.1:9909 (HUM_METRICS_ADDR override). Counters at evict + kill sites; gauge for active cells.
- governor token bucket on thrumd accept loop (100/s).
- Reconnect jitter on serve_worker reconnect to spread thundering herds.
- 3 lifecycle tests: cancel kills, natural exit propagates, tree-kill takes grandchild.
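The lock problem behind issue #41 can be reproduced without any process at all. The sketch below is a std-only illustration, not the code in this PR: `FakeChild`, both function names, and the channel standing in for `CancellationToken` are assumptions made for the example. The first function shows why a `try_lock` from the kill path never succeeds while a waiter holds the guard; the second shows the single-owner shape, where the supervisor waits on "cancel requested" instead of sharing a lock.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

/// Stand-in for a child process handle (hypothetical; no real process is spawned).
#[derive(Default)]
pub struct FakeChild {
    pub killed: bool,
}

/// Broken shape: the waiter holds the mutex for the whole wait, so the kill
/// path's `try_lock` fails for as long as the child is alive.
pub fn try_lock_kill_reaches_child() -> bool {
    let shared = Arc::new(Mutex::new(FakeChild::default()));
    let (locked_tx, locked_rx) = mpsc::channel();
    let (done_tx, done_rx) = mpsc::channel::<()>();

    let waiter = {
        let shared = Arc::clone(&shared);
        thread::spawn(move || {
            let _guard = shared.lock().unwrap(); // held for the entire "wait"
            locked_tx.send(()).unwrap();
            let _ = done_rx.recv(); // returns once `done_tx` is dropped
        })
    };

    locked_rx.recv().unwrap(); // the waiter now owns the lock
    let reached = match shared.try_lock() {
        Ok(mut child) => {
            child.killed = true;
            true
        }
        Err(_) => false, // WouldBlock: the kill never reaches the child
    };
    drop(done_tx); // let the waiter finish so the thread can be joined
    waiter.join().unwrap();
    reached
}

/// Fixed shape: one supervisor owns the child outright and reacts to a
/// cancel signal, so there is no lock to contend on.
pub fn cancel_kill_reaches_child() -> bool {
    let (cancel_tx, cancel_rx) = mpsc::channel::<()>();
    let supervisor = thread::spawn(move || {
        let mut child = FakeChild::default();
        loop {
            match cancel_rx.recv_timeout(Duration::from_millis(10)) {
                Ok(()) | Err(RecvTimeoutError::Disconnected) => {
                    child.killed = true; // real code would kill the process group here
                    return child.killed;
                }
                Err(RecvTimeoutError::Timeout) => {} // real code would poll for natural exit here
            }
        }
    });
    cancel_tx.send(()).unwrap();
    supervisor.join().unwrap()
}
```

With tokio the polling loop collapses into a `select!` over the child's wait future and the token's cancellation future, which is the shape the commit describes.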
DaemonConfig::from_env() (and a few other early callers) hit hum_paths functions before any init() ran. Tests panicked. Making xdg() call init() on first miss makes init() optional everywhere; explicit init() at bin entry stays as an eager warmup so child processes inherit the env.
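A minimal sketch of the "init on first miss" pattern, assuming a `OnceLock`-backed cache. The `Xdg` struct, its field names, `state_dir`, and the "/" fallback when HOME is unset are illustrative assumptions, not the hum-paths API. Unlike the real init(), this sketch only caches the resolved values and does not export them into the environment for child processes.

```rust
use std::path::PathBuf;
use std::sync::OnceLock;

/// Resolved base directories (hypothetical shape for illustration).
#[derive(Debug, Clone, PartialEq)]
pub struct Xdg {
    pub config_home: PathBuf,
    pub state_home: PathBuf,
}

static XDG: OnceLock<Xdg> = OnceLock::new();

/// Use the environment variable when set and non-empty, else `home/fallback`.
fn resolve(var: &str, home: &str, fallback: &str) -> PathBuf {
    match std::env::var_os(var) {
        Some(v) if !v.is_empty() => PathBuf::from(v),
        _ => PathBuf::from(home).join(fallback),
    }
}

/// Eager warmup: safe to call any number of times, resolves only once.
pub fn init() -> &'static Xdg {
    XDG.get_or_init(|| {
        // Assumption: fall back to "/" when HOME is unset (never to /tmp).
        let home = std::env::var("HOME").unwrap_or_else(|_| String::from("/"));
        Xdg {
            config_home: resolve("XDG_CONFIG_HOME", &home, ".config"),
            state_home: resolve("XDG_STATE_HOME", &home, ".local/state"),
        }
    })
}

/// Accessor used by every path helper: initializes on first miss, so
/// callers that run before an explicit `init()` no longer panic.
pub fn xdg() -> &'static Xdg {
    init()
}

/// Example helper built on the accessor.
pub fn state_dir() -> PathBuf {
    xdg().state_home.join("hum")
}
```

Because `get_or_init` runs its closure at most once, an early caller and the explicit init() at binary entry both observe the same values.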
Dead modules removed (1500 LOC):
- nest::pool (Nest, never used)
- nest::mock (only used by pool tests)
- nest::health (tiered eviction policy, no callers)
- nest::budget (token/tool-call caps, no callers)
- nest::Listener + nest::ForagerBee traits (Listener impl in serve.rs was empty stubs; ForagerBee had zero impls)
Wired:
- nest::metrics::spawn_sampler now drives a hum_cell_rss_bytes / hum_cell_cpu_ms gauge labelled by pid from inside lifecycle::supervise. Per-cell observability stops being a half-built feature.
- humd.metricsAddr config knob (hum.json) replaces the hardcoded 127.0.0.1:9909. Schema entry added.
Naming standardized: hum_cells_active -> hum_cell_count.
Dead code removed in hives/common/src/serve.rs (HashMap import, WireListener Listener impl), humd/src/lib.rs (HumdSink.cli_path), hum/src/main.rs (home() helper).
Tests added:
- serve::tests::drop_cancels_token: RAII fires on drop
- serve::tests::lru_pop_drops_bundle_and_cancels: LRU eviction kills
- serve::tests::map_clear_cancels_all: shutdown kills everything
- humd::prometheus_endpoint: /metrics actually serves
- thrumd::accept_rate_limit: governor paces accepts at quota
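The three serve tests above all rest on one RAII property: dropping a bundle cancels its token, whichever way the drop happens. A std-only sketch of that property follows. `Token` is a hand-rolled stand-in for `CancellationToken`, and `spawn_into` is a hypothetical helper; the real code keeps bundles in an `lru::LruCache` rather than a plain `HashMap`.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Minimal cancellation flag (assumption: stands in for tokio_util's token).
#[derive(Clone, Default)]
pub struct Token(Arc<AtomicBool>);

impl Token {
    pub fn cancel(&self) {
        self.0.store(true, Ordering::SeqCst);
    }
    pub fn is_cancelled(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

/// Owns the cancel handle for one cell; dropping the bundle cancels it.
pub struct CellBundle {
    pub cancel: Token,
}

impl Drop for CellBundle {
    fn drop(&mut self) {
        self.cancel.cancel();
    }
}

/// Hypothetical helper: register a bundle and hand back an observer token.
pub fn spawn_into(map: &mut HashMap<String, CellBundle>, sid: &str) -> Token {
    let token = Token::default();
    map.insert(sid.to_string(), CellBundle { cancel: token.clone() });
    token
}
```

Removal, eviction and `clear()` all end in `Drop`, so no call site has to remember to kill explicitly.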
…urged, flaky test budgeted
- nest::ResourceLimits now actually applied: claude-cli calls apply_pre_exec on the std::process::Command (via cmd.as_std_mut()) before group_spawn. SpawnSpec.resource_limits stops being decorative.
- CatalogueSlot.sid: stored but never read. Field + setter param dropped.
- humfs::tools::read::is_code_file: production-unused fn with tests testing nothing the codebase uses. Both purged.
- partition_then_heal_converges_wane: budget raised 1s -> 10s so the test is no longer load-flaky under parallel runs.
cargo check + cargo check --tests: zero warnings.
Rendezvous file (P1 #40 fix):
- hum_paths::RuntimeInfo + HumnestRuntimeInfo (atomic write, read, remove).
- thrumd::serve_with_hook fires on_bound after UnixListener::bind. humd uses it to publish runtime.json with socket + pid + version + bound_at_ms.
- hum_paths::thrum_sock_resolved prefers env > runtime.json > default. Daemons keep thrum_sock; bees/CLI use _resolved. Socket-path drift is gone: clients always reach whatever humd actually bound.
- hum doctor connect-tests with a real hello tone, 1s timeout. exists() check replaced. doctor surfaces humnest runtime.json too.
- hum bee --list flags "⚠ crash-looping (exit N)" via new svc_last_exit reading systemctl ExecMainStatus / launchctl 'last exit code'.
humnest crate (bee supervisor sibling daemon):
- Reads humnest.bees[] from hum.json.
- Each bee spawned via nest::lifecycle::supervise (group-kill, RAII).
- Restart policy per bee: always | on-failure (max_retries, backoff) | never.
- Crash-loop state surfaced through humnest-list RPC.
- Control socket at humnest.sock (NDJSON): humnest-spawn|humnest-kill|humnest-list.
- Sibling to humd: humd crash != humnest crash, bees stay alive.
humctl crate (service-manager 0.7 wrapper):
- humctl {install|start|stop|restart|status|uninstall} {humd|humnest}.
- Pure Rust, ServiceLevel::User, cross-platform via service-manager crate.
- scripts/svc.sh DELETED. 300 lines of bash, gone.
- ./install + every hives/*/install rewritten to call humctl, drop svc.sh sourcing.
Hive install now appends to hum.json humnest.bees + restarts humnest instead of generating per-bee systemd/launchd units.
config gained humnest.bees[] schema.
hum CLI: new 'hum nest' subcommand lists humnest bees with state + restart count. 'hum bee enter|exit|reenter' routes through humnest first for kinds in hum.json, falls back to legacy svc paths for unknown / 'all' targets.
247 tests pass.
Same priority order as hum_paths::thrum_sock_resolved on the Rust side: HUM_THRUM_SOCK > runtime.json > computed default. Closes the last gap where non-Rust clients would miss humd's published socket path after restart.
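The precedence rule is small enough to state as a pure function. This is a sketch under assumptions: `resolve_thrum_sock` and its parameters are hypothetical names, the caller is expected to have read HUM_THRUM_SOCK and the `socket` field of runtime.json already, and treating empty strings as "unset" is a choice made for the example.

```rust
use std::path::PathBuf;

/// Pick the socket path: env override > runtime.json > computed default.
/// Empty strings are treated as "not set" (an assumption of this sketch).
pub fn resolve_thrum_sock(
    env_override: Option<&str>,
    runtime_socket: Option<&str>,
    default: PathBuf,
) -> PathBuf {
    env_override
        .filter(|s| !s.is_empty())
        .or(runtime_socket.filter(|s| !s.is_empty()))
        .map(PathBuf::from)
        .unwrap_or(default)
}
```

Keeping the rule free of I/O makes it easy to port to each non-Rust client and to test the three branches in isolation.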
humctl shrinks to humd-only operator (no install/uninstall — bootstrap
owns service registration). New verbs: start, stop, restart, status,
logs, health. Status shells systemctl/launchctl native; health does a
real connect + hello/breath probe through hum_paths::thrum_sock_resolved.
hum_paths::macos_log is replaced by daemon_logs(name) returning a typed
DaemonLogs { Journald{unit} | Files{stdout,stderr} } so callers dispatch
on the enum instead of #[cfg]-ing per platform. humctl logs and hum's
print_recent_logs now share the same path.
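A sketch of what dispatching on the enum instead of on `#[cfg]` might look like. The enum mirrors the shape named in the commit message; `tail_command` and the exact argument lists are assumptions for illustration, not the code humctl runs.

```rust
use std::path::PathBuf;

/// Where a daemon's logs live, as a value rather than a compile-time branch.
#[derive(Debug, Clone, PartialEq)]
pub enum DaemonLogs {
    Journald { unit: String },
    Files { stdout: PathBuf, stderr: PathBuf },
}

/// Hypothetical helper: build the argv that would show the last `lines` lines.
pub fn tail_command(logs: &DaemonLogs, lines: usize) -> Vec<String> {
    let n = lines.to_string();
    match logs {
        DaemonLogs::Journald { unit } => vec![
            "journalctl".into(),
            "--user".into(),
            "-u".into(),
            unit.clone(),
            "-n".into(),
            n,
            "--no-pager".into(),
        ],
        DaemonLogs::Files { stdout, stderr } => vec![
            "tail".into(),
            "-n".into(),
            n,
            stdout.display().to_string(),
            stderr.display().to_string(),
        ],
    }
}
```

The platform decision is made once, where the `DaemonLogs` value is constructed; every consumer is then an ordinary exhaustive `match`.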
install bash generates humd.service + humnest.service inline (Linux) or
humd.plist + humnest.plist (macOS), coupling expressed in the unit
files themselves (Wants= / PartOf= / RunAtLoad). svc.sh stays deleted.
The SIGHUP reload handler I had added to humnest is reverted. It was a
coarse 'reconcile against disk' plaster. Surgical RPC (humnest-spawn /
humnest-kill via the existing control socket) is the right primitive,
and hum_route_verb already uses it.
humnest is gone. orchd (sibling Rust project; user-scope systemd +
launchd platform contributed upstream this round) replaces it.
Architecture now:
launchd/systemd
└── humd (one user service; humctl operates; ./install registers)
orchd
└── per-bee user units (one each, generated from per-hive Orchfile)
Hive surface: each hive ships an Orchfile at its root. `hum hive install
<target>` resolves the target, builds (cargo install --path), copies the
Orchfile into ~/.config/hum/orch.d/, re-assembles ~/.config/hum/Orchfile,
and runs `orchd up <kind>`. No more per-hive install bash scripts.
What dies:
- humnest crate (entire thing — 4 modules, supervisor + control + log + lib)
- hum_paths::humnest_sock, humnest_runtime, HumnestRuntimeInfo
- config::HumnestSection, BeeConfig (orch's Orchfile owns this declaration)
- hum.schema.json humnest section
- 4 hives/*/install bash scripts (claude-cli, claude-repl, humfs, paid-oracle)
- ./install humnest unit generation (no second user unit)
What lives:
- humd, humctl, hum CLI: unchanged in role
- humctl: humd operator only (start/stop/status/logs/health)
- orch + orchd: pulled as git deps by ./install via cargo install
What's new:
- hives/{claude-cli,claude-repl,humfs,paid-oracle}/Orchfile (declarative)
- hum CLI: hive_install does build + register-via-Orchfile + orchd up
- hum CLI: bee enter/exit/reenter routes through orchd up/down/restart
- hum CLI: 'hum nest' delegates to 'orchd status'
- ./install pulls orch + orchd from github via cargo install
openai-server (TS) still has its bash install — TS build pipeline
(pnpm install + tsup) not yet automated through hum hive install. Follow-up.
247 tests pass.
Detects build kind by marker file:
- Cargo.toml → cargo install --path
- package.json → pnpm/npm install + build; writes ~/.local/bin/<kind>
as a node wrapper exec'ing the produced dist/index.js
(honors pre-built dist if no pnpm/npm in PATH)
- go.mod → go build -o ~/.local/bin/<kind>
- 'build' script → execute it (escape hatch for exotic hives)
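Marker-file detection could be sketched as below. `BuildKind` and `detect_build_kind` are hypothetical names, and the precedence when several markers coexist (top of the list wins) is an assumption read off the order of the bullets, not something the commit states.

```rust
use std::path::Path;

/// Build strategies named in the list above (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BuildKind {
    Cargo,
    Node,
    Go,
    Script,
}

/// Return the first build kind whose marker file exists in `hive_root`.
/// Assumption: earlier markers win when more than one is present.
pub fn detect_build_kind(hive_root: &Path) -> Option<BuildKind> {
    const MARKERS: [(&str, BuildKind); 4] = [
        ("Cargo.toml", BuildKind::Cargo),
        ("package.json", BuildKind::Node),
        ("go.mod", BuildKind::Go),
        ("build", BuildKind::Script),
    ];
    MARKERS
        .iter()
        .find(|(name, _)| hive_root.join(name).is_file())
        .map(|(_, kind)| *kind)
}
```

Returning `None` for a directory with no marker leaves the caller free to report an unsupported hive instead of guessing.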
Orchfiles added for the remaining hives (every hive now ships one):
- bp7, grpc, gsm-modem, ollama-server (Rust foragers; bp7-forager,
grpc-forager, ollama-server, gsm-modem binaries)
- openai-server, anthropic-server, vercel-ai (TS HTTP foragers)
- twilio-sms (Go forager)
12 hives total, all surface-uniform: ~/.local/bin/<kind> + Orchfile.
hives/openai-server/install bash deleted — the build pipeline (pnpm
install + tsup build + node wrapper) now lives in hum CLI's
build_node(). One code path for all TS hives.
247 tests pass.
Constitution-level rename. Where hum owns the names, they sing.
SpawnSpec → Egg (the thing-to-be a worker raises)
WorkerBee::spawn → ::raise (workers raise brood)
Cell.pid → Cell.mark (beekeeper's mark)
Cell.stdin → Cell.feed (nurses feed larvae through open cells)
Cell.events → Cell.mmm (the sounds from inside the cell)
Cell.exited → Cell.emerged (adult bee chews out the cap and emerges)
Cell.cancel → Cell.silence + Cell.still() (token field, verb method)
ResourceLimits → Bounds
resource_limits → bounds
CellMetrics → Vitals
Attachment → Pollen (what a forager carries home)
encode_prompt_with_attachments → encode_prompt_with_pollen
nest::lifecycle::supervise → ::tend (nurse bees tend brood cells)
Foreign types stay foreign (CancellationToken, mpsc::Sender, AsyncGroupChild, tokio::process::Command, oneshot::Receiver). We don't own those words. Our surface either reads like apiary biology or stays out of the way.
247 tests pass.
ids/HumId newtype replaces stringly-typed identifiers everywhere hum
mints them. Foreign formats are projections via deterministic transforms.
ids crate
- HumId(String) newtype with Serialize/Deserialize/Display/FromStr/AsRef<str>
- HumId::mint() — ts-prefixed random; sessions, requests, calls
- HumId::from_hash([u8; 32]) — pure hash encoding
- HumId::from_foreign(&str) — deterministic projection; same input → same id
- to_uuid_v5(NS_CLAUDE_SESSION) — outgoing UUID for claude --session-id
- NS_CLAUDE_SESSION preserves the historical HUM_SESSION_NS bytes so
existing claude transcripts stay reachable
Type signature changes (internal)
- nest::Egg.sid: HumId (was String)
- nest::Egg.session_id REMOVED (derived from sid.to_uuid_v5 inside claude-cli)
- nest::Egg.fresh: bool ADDED (resume vs --session-id flag for the worker)
Mint sites canonicalized
- thrumd connection id (cid) — HumId::mint() (was uuid::Uuid::new_v4)
- thrum_core::rid() returns HumId-format (was tsBase36-counterBase36)
- hives/common/mcp_bridge call_id — thrum_core::rid() (no more 'call-' prefix)
- all hive hello rids: bp7, grpc, gsm-modem, paid-oracle, ollama-server, ensemble
- ollama-server per-request sid — HumId::mint()
- all sim test cid/rid fixtures
Boundary
- hives/common/serve.rs canonicalizes incoming wire sid:
HumId::parse(&s).unwrap_or_else(|_| HumId::from_foreign(&s))
- dead HUM_SESSION_NS + sid_to_session helper removed
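The boundary rule (accept a native id as-is, otherwise project the foreign string deterministically) can be sketched as follows. Everything concrete here is an assumption: the real HumId format is not shown in this PR text, so the sketch invents a "16 lowercase hex characters" validity rule, uses `FromStr` where the commit mentions `HumId::parse`, and uses FNV-1a as a stand-in for whatever hash from_foreign really applies.

```rust
use std::str::FromStr;

/// Newtype over the canonical id string.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct HumId(String);

#[derive(Debug, PartialEq, Eq)]
pub struct ParseHumIdError;

impl FromStr for HumId {
    type Err = ParseHumIdError;

    /// Hypothetical validity rule: exactly 16 lowercase hex characters.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        let ok = s.len() == 16 && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'));
        if ok {
            Ok(HumId(s.to_string()))
        } else {
            Err(ParseHumIdError)
        }
    }
}

impl HumId {
    /// Deterministic projection of a foreign id: same input, same HumId.
    /// FNV-1a (64-bit) is a stand-in hash chosen for this sketch.
    pub fn from_foreign(s: &str) -> Self {
        let mut h: u64 = 0xcbf2_9ce4_8422_2325;
        for b in s.bytes() {
            h ^= u64::from(b);
            h = h.wrapping_mul(0x0000_0100_0000_01b3);
        }
        HumId(format!("{h:016x}"))
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

/// Boundary canonicalization: native ids pass through, anything else is projected.
pub fn canonicalize_sid(wire: &str) -> HumId {
    wire.parse().unwrap_or_else(|_| HumId::from_foreign(wire))
}
```

Because the projection is a pure function of its input, a client that keeps sending the same foreign session id keeps landing on the same internal id across restarts.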
hum-paths bonus
- config::denied() uses hum_paths::config_dir() instead of '~/.config/hum'
literal (now correct under XDG_CONFIG_HOME override)
- hum CLI module doc adds 'hum nest' + 'hum update'; drops stale
'Inspection-only for 0.3'; fixes bees.json comment
Untouched (by design)
- Hid (humd/bee identity, sha256(pubkey) hex with role prefix) — different beast
- Wire types stay String; humd canonicalizes at entry handlers
- kad query rids (derived from query_id for cross-hop correlation)
- Cryptographic nonce in paid-oracle
Every humd maintains a signed append-only NDJSON ring of every chi event
it observes. The log is the only authoritative store of activity;
bees.json, sid -> bee routes, session state are projections via
thehum.replay().
New crate: thehum
- Event: chi + sid + rid + body + author hid + seq + ts_ms + prev_hash + sig
- HumId::mint() rids; humd.key signs; sha256 hash chain
- TheHum::open / append / tail / range / replay / snapshot / enforce_retention / anchor
- Retention modes: archive, rolling, light (configurable per humd)
- Snapshots: Merkle root over BTreeMap of state leaves, emitted as
chi:snapshot into own log
- AnchorBackend trait + EvmAnchor scaffold for on-chain commitments
(production users wire their own signer)
- Sorted-key canonical JSON for hashing+signing; sig stripped from
canonical bytes so chain integrity is sig-agnostic
- 24 unit tests + 7 end-to-end tests pass
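The prev_hash chain is what makes the log tamper-evident, and the idea fits in a few lines. This sketch is not the thehum crate: it drops signatures, authors and timestamps, uses a 64-bit FNV-1a in place of sha256, and the `Log` type with its `append` and `verify` methods is a hypothetical reduction for illustration.

```rust
/// Reduced event: sequence number, hash of the previous event, payload.
#[derive(Debug, Clone)]
pub struct Event {
    pub seq: u64,
    pub prev_hash: u64,
    pub body: String,
}

/// FNV-1a, standing in for sha256 (assumption made to stay std-only).
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf2_9ce4_8422_2325;
    for b in bytes {
        h ^= u64::from(*b);
        h = h.wrapping_mul(0x0000_0100_0000_01b3);
    }
    h
}

/// Hash over a fixed, canonical rendering of the event.
pub fn event_hash(e: &Event) -> u64 {
    fnv1a(format!("{}|{:016x}|{}", e.seq, e.prev_hash, e.body).as_bytes())
}

#[derive(Default)]
pub struct Log {
    pub events: Vec<Event>,
}

impl Log {
    /// Append an event linked to the hash of the current tail.
    pub fn append(&mut self, body: &str) {
        let (seq, prev_hash) = match self.events.last() {
            Some(last) => (last.seq + 1, event_hash(last)),
            None => (0, 0),
        };
        self.events.push(Event { seq, prev_hash, body: body.to_string() });
    }

    /// Walk the chain; Err carries the index of the first event whose link is broken.
    pub fn verify(&self) -> Result<(), u64> {
        let mut expected_prev = 0u64;
        for (i, e) in self.events.iter().enumerate() {
            if e.seq != i as u64 || e.prev_hash != expected_prev {
                return Err(i as u64);
            }
            expected_prev = event_hash(e);
        }
        Ok(())
    }
}
```

Editing any stored event changes its hash, so the mismatch surfaces at the next event in the chain; the tail itself is protected only once something later (a snapshot or an anchor) commits to it.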
Integration
- humd opens TheHum from humd_key + hum_paths::thehum_dir() at boot
- Replays the log to rebuild bee manifests (pure handler; reads
event.ts_ms not now())
- Appends every incoming tone before routing (chi log -> handler chain)
- Spawns 30s snapshot+retention background task
- New chi:Backfill handler responds with thehum.range(author, from)
- Backfill enum variant added to thrum-core/chi.rs
Determinism scrub
- humd: nestler_id fallback uses client_id (was SystemTime ms)
- humd: cwd fallback is "/" (was env::var HOME) — replay must not read env
- humd: sorted iteration in worker selection + forager tool catalogue
during chi:Prompt handler (was HashMap order)
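Why sorted iteration matters for replay: `HashMap` iteration order differs between runs, so any tie-break that depends on it can pick a different worker when the same log is replayed. A sketch under assumptions follows; `pick_worker` and its least-loaded policy are hypothetical, and only the sorting step reflects the commit.

```rust
use std::collections::{BTreeMap, HashMap};

/// Pick the least-loaded worker, breaking ties by name so the choice is
/// the same on every run and on every replay.
pub fn pick_worker(load_by_worker: &HashMap<String, u32>) -> Option<&str> {
    // Re-key into a BTreeMap so iteration order is the sorted key order.
    let sorted: BTreeMap<&str, u32> = load_by_worker
        .iter()
        .map(|(name, load)| (name.as_str(), *load))
        .collect();
    // `min_by_key` returns the first minimum, i.e. the smallest name among ties.
    sorted
        .into_iter()
        .min_by_key(|(_, load)| *load)
        .map(|(name, _)| name)
}
```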
Deletions
- hums crate (vestigial; was Hums::load() with discarded result)
- hum_paths::hums_json (no callers)
New surfaces
- hum thehum {status|tail|range|verify|replay}
- humctl thehum (health check, no daemon needed)
- thehum::layout module — single source of truth for seq.bin / snapshots/
/ root.txt names (no literal filenames outside thehum crate)
Tests: 275 passed across the workspace (was 247).
Promote every path construction + every literal basename across the
workspace into hum-paths. Single source of truth for everything hum
reads or writes; nothing constructs a hum filename outside this crate.
New constants (canonical basenames):
THRUM_SOCK_BASENAME, HTTP_SOCK_BASENAME, PENNY_BASENAME,
HUMD_KEY_BASENAME, BEES_SNAPSHOT_BASENAME, RUNTIME_INFO_BASENAME,
HUM_JSON_BASENAME, PEERS_JSON_BASENAME, ORCHFILE_BASENAME,
HIVES_SUBDIR, RECIPES_SUBDIR, HIVE_INSTALL_SCRIPT
New helpers (composed paths):
home(), expand_tilde(p) — single tilde-expander for user config
local_dir(), local_bin_dir() — $HOME/.local + .local/bin
hum_bin(name) — installed-binary location for a hum binary
fnm_node_bin() — fnm-managed node fallback
claude_data_dir(), claude_session_dir(cwd_hash) — claude CLI layout
orch_d_dir(), orchfile() — orchd integration files
foreign_hive_cache(org, repo, branch) — github-source clone cache
svc_script() — scripts/svc.sh shipped with the source clone
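As an illustration of the "single tilde-expander" idea, here is a sketch that takes the home directory as a parameter so it stays a pure function. `expand_tilde_with` is a hypothetical name; the real expand_tilde(p) presumably obtains home() itself. Leaving `~user/...` forms untouched is an assumption of the sketch.

```rust
use std::path::{Path, PathBuf};

/// Expand a leading `~` or `~/` against `home`; leave everything else as-is.
/// `~user/...` is deliberately not expanded (assumption of this sketch).
pub fn expand_tilde_with(home: &Path, p: &str) -> PathBuf {
    if p == "~" {
        return home.to_path_buf();
    }
    match p.strip_prefix("~/") {
        Some(rest) => home.join(rest),
        None => PathBuf::from(p),
    }
}
```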
Migration:
- config::expand_tilde delegates to hum_paths::expand_tilde
- hives/{claude-cli,claude-repl} propagate HOME via hum_paths::home()
- claude-cli/graft uses claude_session_dir(cwd_hash)
- hum CLI's home_local() / hum_orchfile() helpers deleted; callers use
hum_paths::{local_dir, local_bin_dir, hum_bin, orchfile, orch_d_dir,
foreign_hive_cache, svc_script, HIVES_SUBDIR, RECIPES_SUBDIR,
HIVE_INSTALL_SCRIPT, ORCHFILE_BASENAME}
- sim, penny, humd/peers tests reach for hum_paths::*_BASENAME and
hum_paths::peers_json() instead of literal filenames
- humd/peers test uses hum_paths::config_dir() after XDG_CONFIG_HOME
override (was reconstructing 'hum' subdir manually)
The only HOME / XDG_* reads left in the workspace:
- hum-paths itself (the source of truth)
- doctor diagnostics (reports raw env values to the user)
- test isolation (set_var XDG_* on tmp dirs)
All routine code reaches for hum_paths instead.
275 tests pass.
b25ddc3 killed scripts/svc.sh but the CLI kept calling its functions through dead bash shell-outs (svc_helper, svc_active, svc_last_exit, svc_start/stop/restart, svc_list, svc_uninstall, svc_status). Removing all of it.
Deletions:
- svc_helper(): scripts/svc.sh discovery
- svc_active(), svc_last_exit(): bash exit-code probes
- bee_list(svc): bash svc_list scraper
- resolve_units(): unit name resolver only used by the bash path
- hum_paths::svc_script(): no callers after this commit
- All bash shell-outs to svc_start/stop/restart/uninstall/status
Rewrites:
- bee_list_full now takes a Vec<String> from orch_catalog() (installed hive kinds), not svc-discovered unit names. State printed as 'orchd-managed' / 'unmanaged' / 'installed, not handshaked' based on presence in humd's bees.json + orch catalog.
- bee() routes every verb through orch_route_verb. No bash fallback.
- hive_list() marks running kinds from orch_catalog() (no svc_list).
- uninstall() calls 'humctl stop' instead of bash svc_uninstall.
- status() drops the trailing 'svc_status hum' bash call (humctl status is the canonical surface now).
275 tests pass.
thehum::layout module merged into hum-paths (single source of truth):
pub const THEHUM_SEQ_BASENAME / THEHUM_SNAPSHOTS_SUBDIR /
THEHUM_ROOT_BASENAME / THEHUM_NDJSON_EXT
pub fn thehum_seq_file / thehum_snapshots_dir / thehum_root_file
hum + humctl + thehum-internal callers updated.
Compiler-flagged dead code (zero remaining):
- removed gsm-modem::now_ms (never used after the hello-rid migration
to ids::HumId::mint)
- removed humd's stale "see TODO in nest::pool::Nest" comment — pool
module was deleted in the orchd adoption
Stale TS-era references rewritten:
- claude-repl module doc dropped "Real behavior in TS lives in
harness.ts" (Rust IS the implementation now); unused FSM variants
Hunting/Wilting/Hushed deleted
- ollama-server `images: Option<Vec<String>>` field deleted (was an
#[allow(dead_code)] placeholder for a feature that never landed)
- hives/common/serve module doc points at current Cell API (cell.mmm,
cell.still, raise instead of spawn)
- thrum-core envelope/prim docs: drop "TS daemon" / "TS wire shape"
framing
- thrumd::thrum_broadcast: drop "matches the TS daemon's routing"
- drone::Health: drop "the TS Assessment strings"
- claude-cli/graft: drop "TS writer" / "like the TS does"
- ids tests: drop "lib/id.ts encodeBase32" reference
- penny load test: drop "TS shape" wording
Test pass: 275.
Drove from rustc's unreachable_pub lint with RUSTFLAGS=-W unreachable_pub across the workspace. 68 items downgraded:
ensemble/kad.rs                                6  internal kad helpers (ParsedFindNode*, etc)
thrumd/conn.rs                                 1
thrumd/registry.rs                             7  internal sigil-broadcast types
config/lib.rs                                  7  defaults::* helpers
humd/peers.rs                                  1  load() called only from boot
hives/common/mcp_bridge.rs                     8  test-side reqwest_lite helpers
hives/humfs (ast/* + tools/* + dispatch.rs)   38  intra-crate types
hum-paths was excluded by design: every helper there is meant to be called from anywhere in the workspace, current consumers or not.
275 tests still pass.
paths.rs holds every "../thrum-clients/{ts,python,go}/..." literal.
Both `cargo run -p codegen` and `thrum-core/build.rs` route through
codegen::paths instead of carrying their own copies, so a rename can't
leave one site stale.
While here: --check now covers all three targets (ts, python, go) and
no-arg `cargo run -p codegen` regenerates all of them. The previous
CLI silently dropped python and go even though build.rs emitted them.
The library had every piece. humd was only plugged into the outbound
half, and only over TCP. Two real humds could not meet.
humd/src/peer_transport/ now owns the daemon-side plumbing as two
sibling modules:
iroh.rs bind(humd_key) returns an IrohTransport whose NodeId is
pinned to the persistent HumdKey, so the signed hello
verifies against the iroh-routed identity. dial_all walks
peers.json for iroh: hints; spawn_listener detaches an
accept loop. Both paths land at Ensemble::install (signed).
tcp.rs Sibling shape. spawn_listener binds humd.tcpListen and
accepts; dial_all walks tcp: hints. Plaintext NDJSON; the
signed hello is the only authentication on the wire.
Surrounding changes that came up while finishing this:
ensemble::IrohTransport::bind_direct_with_key(&HumdKey)
New constructor. Pins iroh's SecretKey to our HumdKey so NodeId
== pubkey and Hid == sha256(pubkey) collapse to one identity.
ensemble drainer re-keys peers on Verified hello
TCP accept() returns a Hid::random_humd() placeholder because
plain TCP can't authenticate the peer pre-handshake. The drainer
was rejecting every inbound signed hello on `claimed_id == id`.
Verified hellos now move the registry entry from the placeholder
to the cryptographically-verified id. iroh path unchanged (its
NodeId matches claimed_id by construction). Unsigned hellos still
require id match, since the sig is what makes re-key safe.
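The re-key rule reads naturally as a small state transition on the peer registry. The sketch below is an assumption-laden reduction: `Hello`, `Outcome` and `on_hello` are hypothetical names, ids are plain strings, and signature verification is taken as already done by the time a `Hello::Verified` value exists.

```rust
use std::collections::HashMap;

/// A hello after signature checking (hypothetical reduction).
#[derive(Debug, PartialEq, Eq)]
pub enum Hello {
    Verified { claimed_id: String },
    Unsigned { claimed_id: String },
}

#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Kept,
    Rekeyed,
    Rejected,
}

/// Apply a hello received on the connection currently registered as `conn_id`.
pub fn on_hello<P>(registry: &mut HashMap<String, P>, conn_id: &str, hello: &Hello) -> Outcome {
    match hello {
        // Ids already agree (the iroh case): nothing to move.
        Hello::Verified { claimed_id } | Hello::Unsigned { claimed_id }
            if claimed_id == conn_id =>
        {
            Outcome::Kept
        }
        // Signed hello with a different id: move the entry from the
        // placeholder key to the cryptographically verified id.
        Hello::Verified { claimed_id } => match registry.remove(conn_id) {
            Some(peer) => {
                registry.insert(claimed_id.clone(), peer);
                Outcome::Rekeyed
            }
            None => Outcome::Rejected,
        },
        // Unsigned hello must match: without a signature, re-keying is unsafe.
        Hello::Unsigned { .. } => Outcome::Rejected,
    }
}
```

The asymmetry between the two mismatching cases is the point: only a verified signature justifies trusting the claimed id over the one assigned at accept time.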
humd::identity::read_key
Read-only loader so `hum ensemble` can show the daemon's id
without minting one as a side effect.
config: humd.tcpListen
Optional "host:port" in hum.json. Omitted = dial-only over TCP.
iroh is independent of this setting (always tried).
RuntimeInfo.ensemble_addrs
Populated with iroh: + iroh-ip: (and tcp: when listening) hints.
A peer copies these into its peers.json to dial this humd back.
hum CLI: ensemble subcommand
`hum ensemble` show me + reach + configured peers
`hum ensemble peer add ...` append entry to peers.json (atomic)
`hum ensemble peer rm ...` drop by humd_id or alias
Tests:
peer_transport::iroh::tests real iroh QUIC round-trip, both
sides end with each other in
Ensemble::peers().
peer_transport::tcp::tests same shape, exercises the re-key
path on the accept side.
46 ensemble tests, 277 workspace tests still green.
What's still cold but unblocked: live peer state in `hum ensemble`
needs an admin-tone RPC into the running daemon. Today the CLI reads
on-disk artifacts only. Kad lookup is fed by install but no caller
yet queries it. TLS transport is library-only. Iroh's relay/WAN bind
path exists but humd boots with bind_direct (loopback/LAN only).
No description provided.